The possible impact of weather on crimes in Amsterdam#
import pandas as pd
from pandas import Timedelta
import numpy as np
import plotly.express as px
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from IPython.display import display, Markdown
import plotly.io as pio
pio.renderers.default = "notebook"
import plotly.graph_objects as go
df = pd.read_csv("merged_weather_misdrijven_monthly_v3.csv")
People have been studying both human behavior and weather for centuries. One interesting question that brings these two topics together is whether weather conditions can influence the chances of people committing crimes.
Some researchers are convinced that the weather does affect crime rates. For example, Anderson (2001) found that high temperatures can lead to more aggressive behavior. In our research, we look at whether certain weather conditions like heat and precipitation might be linked to changes in the number of reported crimes. We focus on Amsterdam, a densely populated city where detailed data is available.
For the weather data, we used daily measurements from 2012 to 2025 collected by the Royal Netherlands Meteorological Institute (KNMI) at Schiphol Airport. The crime data comes from the Central Bureau of Statistics Netherlands (CBS) and contains monthly counts for different types of crime in Amsterdam over the same period. This lets us look at broad questions, such as whether serious crimes follow seasonal patterns, as well as more specific ones, such as whether warmer weather coincides with more incidents like bike theft or public violence.
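Because the weather data is daily and the crime data is monthly, the two sources have to be brought to the same granularity before they can be compared. The following is a minimal sketch of that step, assuming the daily values are simply averaged per calendar month. The helper `monthly_mean` and the sample values are hypothetical and are not the code that produced the merged file.

```python
from collections import defaultdict
from datetime import date


def monthly_mean(daily_records):
    """Average daily values per calendar month.

    daily_records: iterable of (date, value) pairs.
    Returns a dict mapping 'YYYY-MM' to the mean of that month's values.
    """
    buckets = defaultdict(list)
    for day, value in daily_records:
        buckets[f"{day.year}-{day.month:02d}"].append(value)
    return {key: sum(values) / len(values) for key, values in buckets.items()}


# Made-up daily maximum temperatures in °C (illustrative only)
sample = [
    (date(2020, 1, 1), 6.0),
    (date(2020, 1, 2), 8.0),
    (date(2020, 2, 1), 10.0),
]
monthly = monthly_mean(sample)
print(monthly)
```

The resulting keys have the same year-month shape as a monthly crime table, so the two can be joined on that key.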
As we dive into researching the potential patterns, we show visualizations to provide an overview of the influence that temperature may have on different forms of criminal activities. These graphs help lay the foundation for deeper analysis and allow us to explore whether weather truly shapes crime, or whether crime is, at its core, independent of weather conditions.
Perspective 1: Pleasant weather increases the risk of certain crimes#
Pleasant weather often brings people outside, increasing public activity in parks, streets, and recreational areas. This rise in outdoor presence may also create more opportunities for crimes such as bicycle theft, pickpocketing, or alcohol-related incidents. In this section, we examine how sunshine and higher temperatures might correlate with these types of crime.
Argument 1.1 Higher temperatures lead to more bicycle theft#
In Amsterdam, cycling is the main form of transport. Better weather encourages more outdoor activity, such as events or trips to the beach, which means more journeys and more bikes left outside. This should create ideal conditions for bicycle theft and, if confirmed, would support the idea that weather has an influence on crime. To look into this relationship between theft and warm weather, we inspect temperature and the number of bicycle thefts in the graph below.
df.columns = df.columns.str.strip()
df['TX'] = df['TX'] / 10  # TX is recorded in tenths of °C
df['TG'] = df['TG'] / 10  # TG is recorded in tenths of °C
display(Markdown("### Bicycle Theft vs. Maximum Temperature"))
display(Markdown("_Slight increase in thefts when it's warmer._"))
# Create the plot
plt.figure(figsize=(8, 5))
sns.regplot(
    x='TX',
    y='1.2.3 Diefstal van brom-, snor-, fietsen',
    data=df,
    scatter_kws={'alpha': 0.5}
)
plt.xlabel("Maximum Temperature (°C)")
plt.ylabel("Number of Bicycle Thefts")
plt.grid(True)
plt.tight_layout()
plt.show()
Bicycle Theft vs. Maximum Temperature
Slight increase in thefts when it’s warmer.
This scatterplot shows the relationship between monthly maximum temperature (°C) on the horizontal axis and the number of bicycle thefts reported in Amsterdam on the vertical axis. Each point represents one month. The blue regression line indicates an upward trend: bicycle thefts tend to increase when temperatures are higher. Pearson’s correlation between the two is 0.502, which is considered moderate. To put it more intuitively: for every additional degree of monthly maximum temperature, an average of 13.23 more bicycle thefts are reported in that month. This result supports the first argument, since warmer months tend to see more bike theft.
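The two numbers quoted here, Pearson’s correlation and the slope of the regression line, can both be derived from the same sums of deviations. The sketch below shows that calculation on made-up values; `temps` and `thefts` are hypothetical sample data, not figures from the dataset, and the helper names are assumptions for illustration.

```python
import math


def deviation_sums(xs, ys):
    """Return (covariance sum, x variance sum, y variance sum) around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov, var_x, var_y


def pearson_r(xs, ys):
    """Pearson correlation coefficient: strength of the linear relationship."""
    cov, var_x, var_y = deviation_sums(xs, ys)
    return cov / math.sqrt(var_x * var_y)


def ols_slope(xs, ys):
    """Least-squares slope: expected change in y per one-unit change in x."""
    cov, var_x, _ = deviation_sums(xs, ys)
    return cov / var_x


# Made-up monthly values (illustrative only)
temps = [5.0, 10.0, 15.0, 20.0, 25.0]   # monthly maximum temperature in °C
thefts = [400, 480, 520, 610, 650]      # bicycle thefts in that month

r = pearson_r(temps, thefts)
slope = ols_slope(temps, thefts)
print(f"r = {r:.3f}, slope = {slope:.2f} thefts per °C")
```

The correlation is unitless and bounded between -1 and 1, while the slope carries units (thefts per °C), which is why the text reports both.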
Argument 1.3: Poor visibility encourages burglary#
One might expect foggy or dark days to be associated with more home burglaries in a given region. Dark days mean reduced visibility, which gives burglars a better chance to act unseen. The weather may be a criminal’s best friend. To test this, we check the correlation between the number of home burglaries and the minimum visibility per month.
plt.figure(figsize=(10,6))
sns.regplot(data=df, x='VVN', y='1.1.1 Diefstal/inbraak woning', scatter_kws={'s':50}, line_kws={'color':'red'})
plt.title('Lower Visibility May Be Linked to More Home Burglaries')
plt.xlabel('Average Monthly Minimum Visibility (VVN scale, higher = better visibility)')
plt.ylabel('Total Burglary Incidents per Month')
plt.grid(True)
plt.show()
This scatterplot with a regression line shows the relationship between average monthly minimum visibility (VVN) and the number of reported burglary incidents. The horizontal axis represents visibility, with higher values indicating clearer conditions. The vertical axis shows the total number of burglaries each month. The negative trend suggests that burglary incidents increase mildly in months with lower visibility, in line with the hypothesis that poor visibility creates favorable conditions for intruders. Pearson’s correlation between the two is -0.318, which is considered a low correlation. While the relationship is not strong, the observed trend is consistent with the hypothesis and may indicate a weak association worth further investigation.
Perspective 2: Cold or bad weather influences other crimes (or none at all)#
An alternative perspective shows that, although specific categories of crime may be affected by warm weather conditions, others could be more common in colder, darker periods. There might also be crimes that do not show patterns with the weather at all. In this section, we explore how low visibility might relate to burglary, and whether certain types of crime show seasonal patterns, regardless of temperature.
Argument 2.1: Serious crimes are not weather-dependent#
For more serious crime, there is generally no obvious reason why it should depend on seasonal weather. Murder and abuse should not be associated with seasonal patterns: warm days may bring more activity and more opportunity to commit these crimes, while cold days may in turn foster a more melancholic mental state that motivates them. Cybercrime sounds like an indoor activity, which creates the expectation of more cybercrime in winter, but no such correlation was found. This supports the claim that serious crime is not influenced by weather or seasonality.
weer_vars = ['TG', 'RH', 'VVN']
misdaad_vars = [
    '1.1.1 Diefstal/inbraak woning',
    '1.3.1 Ongevallen (weg)',
    '3.4.2 Onder invloed (water)',
    '1.4.2 Moord, doodslag',
    '2.5.2 Winkeldiefstal',
    '1.2.4 Zakkenrollerij',
    '1.2.3 Diefstal van brom-, snor-, fietsen'
]
# Build a subset dataframe with the weather and crime columns
df_subset = df[weer_vars + misdaad_vars]
# Compute the correlation matrix
corr_matrix = df_subset.corr()
# Keep only the correlations between weer_vars (rows) and misdaad_vars (columns)
corr_submatrix = corr_matrix.loc[weer_vars, misdaad_vars]
crime_translation = {
    '1.1.1 Diefstal/inbraak woning': 'Burglary (home)',
    '1.3.1 Ongevallen (weg)': 'Traffic Accidents',
    '3.4.2 Onder invloed (water)': 'Under Influence (water)',
    '1.4.2 Moord, doodslag': 'Murder / Manslaughter',
    '2.5.2 Winkeldiefstal': 'Shoplifting',
    '1.2.4 Zakkenrollerij': 'Pickpocketing',
    '1.2.3 Diefstal van brom-, snor-, fietsen': 'Bicycle Theft'
}
corr_submatrix.columns = corr_submatrix.columns.map(crime_translation)
# Optional: translate the weather variable names as well
weather_translation = {
    'TG': 'Avg. Temperature',
    'RH': 'Precipitation (mm)',
    'VVN': 'Min Visibility (hm)'
}
corr_submatrix.index = corr_submatrix.index.map(weather_translation)
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(12, 6))
sns.heatmap(
    corr_submatrix,
    annot=True,
    cmap='coolwarm',
    center=0,
    vmin=-0.6,
    vmax=0.6
)
plt.title('Correlation between Weather Variables and Crime Categories')
plt.xlabel('Crime Categories')
plt.ylabel('Weather Variables')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
# Map each month number to a season
season_map = {
    1: 'Winter', 2: 'Winter', 12: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Autumn', 10: 'Autumn', 11: 'Autumn'
}
# Extract numeric month from 'year_month' and map to season
df['month'] = pd.to_datetime(df['year_month']).dt.month
df['Season'] = df['month'].map(season_map)
# Serious crime columns and their display names
serious_crimes = {
    '1.4.2 Moord, doodslag': 'Murder',
    '1.4.5 Mishandeling': 'Abuse',
    '3.7.4 Cybercrime': 'Cybercrime'
}
# Create pie charts
fig, axes = plt.subplots(1, 3, figsize=(18, 6))
for ax, (col, label) in zip(axes, serious_crimes.items()):
    # Total incidents per season for each crime
    season_totals = df.groupby('Season')[col].sum().reindex(['Winter', 'Spring', 'Summer', 'Autumn'])
    # Pie chart
    ax.pie(
        season_totals,
        labels=season_totals.index,
        autopct='%1.1f%%',
        startangle=90,
        colors=sns.color_palette("pastel")[0:4]
    )
    ax.set_title(f'Seasonal Distribution of {label}')
# Title and caption
fig.suptitle('Serious Crimes Occur Steadily Across All Seasons', fontsize=16)
plt.tight_layout()
plt.show()
These pie charts show the distribution of monthly incidents of three serious crime categories (murder, abuse, and cybercrime), grouped by season. If these crime types were weather- or season-dependent, one would expect major seasonal differences, or at least noticeable differences between summer and winter. Instead, the pie charts reveal relatively even distributions across winter, spring, summer, and autumn. Winter generally has a slightly smaller share. This is because each season consists of three calendar months and winter includes February, which has fewer days. Taking that into account, cybercrime might be slightly more frequent in winter months, but for none of the three categories is there a substantial difference.
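The February effect can be separated from a real seasonal effect by dividing each season's total by the number of days it covers. The sketch below illustrates this under stated assumptions: the function `season_totals_and_rates` is a hypothetical helper, and the sample data is constructed with exactly 2 incidents per day in every month, so any difference in raw totals comes purely from season length.

```python
import calendar

SEASON_OF_MONTH = {
    12: 'Winter', 1: 'Winter', 2: 'Winter',
    3: 'Spring', 4: 'Spring', 5: 'Spring',
    6: 'Summer', 7: 'Summer', 8: 'Summer',
    9: 'Autumn', 10: 'Autumn', 11: 'Autumn',
}


def season_totals_and_rates(monthly_counts):
    """monthly_counts: iterable of (year, month, incident_count).

    Returns (totals, rates): raw incident totals per season and
    incidents per day per season.
    """
    totals, days = {}, {}
    for year, month, count in monthly_counts:
        season = SEASON_OF_MONTH[month]
        totals[season] = totals.get(season, 0) + count
        # monthrange returns (weekday of first day, number of days in month)
        days[season] = days.get(season, 0) + calendar.monthrange(year, month)[1]
    rates = {season: totals[season] / days[season] for season in totals}
    return totals, rates


# Made-up data: a constant 2 incidents per day throughout 2021
sample = [(2021, m, 2 * calendar.monthrange(2021, m)[1]) for m in range(1, 13)]
totals, rates = season_totals_and_rates(sample)
print(totals)  # raw totals differ because the seasons differ in length
print(rates)   # per-day rates are identical
```

In this constructed case winter has the smallest raw total even though the daily rate never changes, which is the pattern the pie charts would show for a crime with no seasonality at all.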
Conclusion#
As shown in the previous graphs, crime can correlate substantially with the weather if you look at the right categories, such as bike theft or substance-related incidents on boats. Warmer and brighter days influence our day-to-day lives in ways that are sometimes relevant to crime prediction. At the same time, more serious categories of crime do not seem to correlate substantially with weather conditions. To illustrate this concisely, we look at how several crime types relate to the monthly maximum temperature, compared with ‘incidents under the influence of substances on water’.
crime_columns = {
    'Burglary (home)': '1.1.1 Diefstal/inbraak woning',
    'Traffic Accidents': '1.3.1 Ongevallen (weg)',
    'Under Influence (water)': '3.4.2 Onder invloed (water)',
    'Murder / Manslaughter': '1.4.2 Moord, doodslag',
    'Shoplifting': '2.5.2 Winkeldiefstal',
    'Pickpocketing': '1.2.4 Zakkenrollerij',
    'Bicycle Theft': '1.2.3 Diefstal van brom-, snor-, fietsen'
}
def create_slider_plot(df):
    temps = [round(t, 1) for t in list(frange(5.0, 26.5, 0.5))]
    fig = go.Figure()
    # Build one frame per temperature step
    frames = []
    for temp in temps:
        # Months whose maximum temperature lies within 0.5 °C of this step
        lower, upper = temp - 0.5, temp + 0.5
        filtered = df[(df['TX'] >= lower) & (df['TX'] < upper)]
        # Share of each crime type's total incidents that falls in this window
        y = [(filtered[col].sum() / df[col].sum()) * 100 if df[col].sum() > 0 else 0
             for col in crime_columns.values()]
        frames.append(go.Frame(
            data=[go.Bar(x=list(crime_columns.keys()), y=y)],
            name=f"{temp}"
        ))
    # Use the first frame's data as the initial trace
    fig.add_trace(frames[0].data[0])
    # Set the layout, slider and frames
    fig.update_layout(
        title="Crime Distribution by Temperature",
        yaxis=dict(range=[0, 25], title="Pct of Total Incidents"),
        xaxis_title="Crime Type",
        width=800, height=600,
        updatemenus=[dict(
            type="buttons",
            showactive=False,
            buttons=[dict(label="Play", method="animate",
                          args=[None, {"frame": {"duration": 300, "redraw": True},
                                       "fromcurrent": True, "transition": {"duration": 0}}])]
        )],
        sliders=[dict(
            active=temps.index(5.0),
            currentvalue={"prefix": "Temp: "},
            pad={"t": 50},
            steps=[dict(label=f"{t}°C", method="animate",
                        args=[[str(t)], {"frame": {"duration": 0}, "mode": "immediate"}])
                   for t in temps]
        )]
    )
    fig.frames = frames
    return fig
def frange(start, stop, step):
    # Float-friendly range that includes the stop value
    while start <= stop:
        yield round(start, 1)
        start += step
fig = create_slider_plot(df)
fig.show()
It is clearly visible that, out of this list, ‘under influence (water)’ is the only crime type that is strongly influenced by temperature. All other categories fluctuate a little, but are relatively evenly distributed across all temperatures.
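Each bar in the animation is the share of a crime type's total incidents that occurred in months whose maximum temperature lies within 0.5 °C of the slider value. The sketch below reproduces that calculation for a single crime type on made-up data; `share_in_window` and the sample values are hypothetical and only mirror the filtering logic of the plotting code.

```python
def share_in_window(months, center, half_width=0.5):
    """Percentage of all incidents that fall in a temperature window.

    months: list of (max_temperature, incident_count) pairs.
    The window is [center - half_width, center + half_width).
    """
    total = sum(count for _, count in months)
    if total <= 0:
        return 0.0
    in_window = sum(
        count for temp, count in months
        if center - half_width <= temp < center + half_width
    )
    return 100 * in_window / total


# Made-up months: (monthly maximum temperature in °C, incidents)
sample = [(9.8, 10), (10.2, 30), (14.9, 20), (21.0, 40)]

share_at_10 = share_in_window(sample, 10.0)    # window [9.5, 10.5)
share_at_10_5 = share_in_window(sample, 10.5)  # window [10.0, 11.0)
print(share_at_10, share_at_10_5)
```

Because the slider moves in steps of 0.5 °C while each window is 1 °C wide, neighbouring slider positions overlap: the 10.2 °C month in this sample is counted at both 10.0 and 10.5. The percentages therefore do not add up to 100 across slider positions and are best compared between crime types at the same position.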